SUMMARY

Let $\pi_1,\dots,\pi_k$ be given populations associated with unknown real parameters $\theta_1,\dots,\theta_k$. $\pi_i$ is assumed to be "good" if $\theta_i > \theta_0$, where $\theta_0 \in \mathbb{R}$ is a given control value, $i = 1,\dots,k$. The goal is to find the "best" population (i.e. that one with the largest parameter), if it is "good", in 2 stages with screening out "bad" populations in the first stage. Consideration is restricted to permutation invariant procedures. It is shown that under MLR and a general invariant loss structure the natural final decisions are optimum. More generally an extension of the "Bahadur-Goodman Theorem" to sequential settings (with and without relation to a control) is derived. If the loss structure consists of the cost for sampling plus the loss for final decision, it is shown that for every symmetric prior there exists a Bayes procedure which selects at the first stage populations according to the largest observations. Natural procedures, which screen out with the UMP test for $H: \theta \le \theta_0$ versus $K: \theta > \theta_0$ at fixed level $\alpha$, are considered. As an example, all results are studied in the special case of normal populations with unknown means and a common known variance.
1.  Introduction.  Let $\pi_1,\dots,\pi_k$ be $k$ populations associated with unknown parameters $\theta_1,\dots,\theta_k \in \Omega \subseteq \mathbb{R}$. Let $\theta_0 \in \Omega$ be a given control value such that every $\pi_i$ with $\theta_i > \theta_0$ is assumed to be "good", and "bad" otherwise, $i = 1,\dots,k$. We consider the problem: how to find the best population (i.e. that one associated with the largest parameter) among the good ones (if there is any) in two stages by screening out non-best (or bad) populations in the first stage.

Assume that samples $\{X_{ij}\}_{j=1,\dots,n}$ and $\{Y_{ij}\}_{j=1,\dots,m}$ can be drawn from $\pi_i$ at the first and the second stage, respectively, $i = 1,\dots,k$, which are mutually independent. Let $U_i$ and $V_i$ be real-valued sufficient statistics for $\theta_i$ with respect to these samples which have densities $f_{\theta_i}$ and $g_{\theta_i}$, respectively, with respect to the Lebesgue measure on $\mathbb{R}$, $i = 1,\dots,k$. The families $\{f_\theta\}_{\theta\in\Omega}$ and $\{g_\theta\}_{\theta\in\Omega}$ are assumed to be known and to have monotone non-decreasing likelihood ratios (MLR). Finally, let $W_i = T(U_i,V_i)$ be a real-valued sufficient statistic for $\theta_i$ with respect to $(U_i,V_i)$, which has a density $h_{\theta_i}$ with respect to the Lebesgue measure on $\mathbb{R}$, where the family $\{h_\theta\}_{\theta\in\Omega}$ also has MLR. For notational convenience, let $U = (U_1,\dots,U_k)$, and let $V$, $W$ etc. have analogous meaning.

In this paper we will study a certain class of 2-stage procedures $(S,d)$, defined as follows. Let $S$ denote any subset selection procedure based on $U$, i.e. $S: \mathbb{R}^k \to \{s \mid s \subseteq \{1,\dots,k\}\}$ measurable with respect to the Borel sets in $\mathbb{R}^k$, where an empty subset is admitted. $S$ acts as a screening procedure in the first stage. Let $d = \{d_s\}_{s \subseteq \{1,\dots,k\}}$ with $d_\emptyset = 0$ and $d_{\{i\}} = i$, $i = 1,\dots,k$. Moreover, for every $s \subseteq \{1,\dots,k\}$ with size $|s| \ge 2$, let $d_s: \mathbb{R}^k \times \mathbb{R}^k \to s$, where $d_s(u,v)$ depends only on variables $u_i$ and $v_i$ with $i \in s$, and where $d_s$ is measurable with respect to the Borel sets in their joint space $\mathbb{R}^{2|s|}$. $d$ represents the set of final decisions at the first stage and the second stage, respectively. The introduction of the (at first sight) somewhat complicated looking structure $d$ will prove to be very convenient in the sequel. Now we are ready to define our 2-stage procedures in a concise way.

Definition 1.  2-stage procedure $(S,d)$.

Stage 1: Take observations (i.e. the X-samples) from $\pi_1,\dots,\pi_k$. Select all populations $\pi_i$ with $i \in s = S(U)$. If $s = \emptyset$, stop, and decide $d_\emptyset = 0$ (i.e. "no population is good"). If $s = \{i\}$ for some $i \in \{1,\dots,k\}$, stop, and decide $d_{\{i\}} = i$ (i.e. "$\pi_i$ is good and the best one"). If $|s| \ge 2$, proceed to Stage 2.

Stage 2: Take additional observations (i.e. the Y-samples) from all populations $\pi_i$ with $i \in s$ and make the final decision $d_s(U,V)$ (i.e. "$\pi_{i_0}$ is good and the best one", if $d_s(U,V) = i_0$, say, for some $i_0 \in s$).
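The control flow of Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `two_stage_decision`, `threshold_screen` and `largest_v` are hypothetical, populations are indexed 1,...,k, and the screening and final decision rules are passed in as plain functions rather than taken from the paper.

```python
from typing import Callable, Dict, Sequence, Set


def two_stage_decision(
    u: Sequence[float],
    screen: Callable[[Sequence[float]], Set[int]],
    draw_second_stage: Callable[[Set[int]], Dict[int, float]],
    final_decision: Callable[[Set[int], Sequence[float], Dict[int, float]], int],
) -> int:
    """Return 0 for "no population is good", else the 1-based index decided on."""
    s = screen(u)                    # Stage 1: s = S(u)
    if len(s) == 0:
        return 0                     # d_emptyset = 0
    if len(s) == 1:
        return next(iter(s))         # d_{i} = i, no second stage needed
    v = draw_second_stage(s)         # Stage 2: Y-samples only for i in s
    return final_decision(s, u, v)   # d_s(u, v), expected to lie in s


def threshold_screen(a: float) -> Callable[[Sequence[float]], Set[int]]:
    """Hypothetical screening rule: keep every population with u_i > a."""
    return lambda u: {i + 1 for i, ui in enumerate(u) if ui > a}


def largest_v(s: Set[int], u: Sequence[float], v: Dict[int, float]) -> int:
    """Hypothetical final decision: largest second-stage statistic within s."""
    return max(sorted(s), key=lambda i: v[i])


# Usage with made-up numbers: populations 2 and 3 pass the screen,
# and population 3 has the larger second-stage statistic.
decision = two_stage_decision(
    [0.1, 1.2, 0.9], threshold_screen(0.5), lambda s: {2: 1.0, 3: 1.5}, largest_v
)
```

Passing the rules as functions mirrors the separation between $S$ and $d$ in the definition; any other screening or final decision rule can be substituted without touching the control flow.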

Throughout this paper we will restrict consideration to procedures $(S,d)$ which are completely (i.e. with respect to both $S$ and $d$) invariant under permutations of the $k$ populations $\pi_1,\dots,\pi_k$.

In Section 2 it will be shown that under any reasonable loss structure the optimal final decisions are always the natural ones, i.e. are associated with the largest sufficient statistic among those coming from the populations which still are eligible. This result can be derived from Lehmann's (1966) version of the "Bahadur-Goodman-Theorem". In Section 3 a natural type 2-stage procedure will be studied which screens out in the first stage by means of a UMP test ("$\theta \le \theta_0$" versus "$\theta > \theta_0$") at a fixed level, which is applied separately to $U_1,\dots,U_k$, respectively. Finally, in Section 4 it will be shown that under a fairly general loss structure (cost for sampling plus loss for final decision) and for every symmetric prior there exists a Bayes 2-stage procedure which is completely monotone (i.e. where also the subset selections are made in terms of the largest observations), provided that a certain condition (Assumption (A) or (B)) is satisfied. This result will be derived by a two-fold application of Eaton's (1967) more general version of the "Bahadur-Goodman-Theorem". Throughout the following we shall repeatedly study, as an example, the case of $k$ normal populations with unknown means $\theta_1,\dots,\theta_k$ and a common known variance $\sigma^2 > 0$.
2.  Optimality of the natural final decisions.  In this section we assume that a loss structure is given which we will specify only with respect to final decisions, and without reference to the control $\theta_0$. This allows us to state the results in a more general setting including also the non-control ("finding the best population") problems such as those studied by Tamhane and Bechhofer (1979).

Definition 2.  Loss structure $L$.

Let us assume that for every procedure $(S,d)$ subsequent decisions $S = s$ and $d_s = i$, $i \in s$, result in a real-valued loss $L(s,i,\theta)$ at $\theta = (\theta_1,\dots,\theta_k) \in \Omega^k$ which is integrable and has the following two properties:

(a)  $L$ is permutation invariant (i.e. $L(\pi s, \pi i, \pi\theta) = L(s,i,\theta)$ in the sense of Eaton (1967) for all permutations $\pi$), and:

(b)  For every $\theta \in \Omega^k$ and $i,j \in \{1,\dots,k\}$ with $\theta_i \le \theta_j$, $L(\{i\},i,\theta) \ge L(\{j\},j,\theta)$; and $s \subseteq \{1,\dots,k\}$ with $i,j \in s$ implies $L(s,i,\theta) \ge L(s,j,\theta)$.

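The risk of a procedure $(S,d)$ at $\theta \in \Omega^k$ is given by

$$
\begin{aligned}
r_\theta(S,d) &= E_\theta\big(L(S(U),\, d_{S(U)}(U,V),\, \theta)\big) \\
&= L(\emptyset,0,\theta)\, P_\theta\{S(U) = \emptyset\} \\
&\quad + \sum_{i=1}^{k} L(\{i\},i,\theta)\, P_\theta\{S(U) = \{i\} \mid |S(U)| = 1\}\, P_\theta\{|S(U)| = 1\} \\
&\quad + \sum_{s,\,|s| \ge 2} \Big( \sum_{i \in s} L(s,i,\theta)\, P_\theta\{d_s(U,V) = i \mid S(U) = s\} \Big) P_\theta\{S(U) = s\}.
\end{aligned}
\tag{1}
$$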

Our first result is with respect to final decisions at Stage 1.

Lemma 1.  Let $(S,d)$ be a 2-stage procedure and let $\tilde S$ be the same procedure as $S$ with the only modification that for all $u \in \mathbb{R}^k$ with $|S(u)| = 1$, $\tilde S(u) = \{i\}$ implies $u_i = \max_{j=1,\dots,k} u_j$, $i \in \{1,\dots,k\}$. Then $r_\theta(\tilde S,d) \le r_\theta(S,d)$ for all $\theta \in \Omega^k$.

Proof:  For a fixed $(S,d)$, let $A = \{u \in \mathbb{R}^k \mid |S(u)| = 1\}$. Let $\theta \in \Omega^k$ with $P_\theta\{U \in A\} > 0$. The conditional distribution of $U$, given $U \in A$, has the following density w.r.t. the Lebesgue measure

$$
P_\theta\{U \in A\}^{-1} \prod_{i=1}^{k} f_{\theta_i}(u_i)\, I_A(u), \qquad u \in \mathbb{R}^k.
\tag{2}
$$

Since by the invariance of $S$, $A$ is permutation symmetric and moreover, $P_\theta\{U \in A\}$ is a permutation invariant function of $\theta \in \Omega^k$, (2) is of the form assumed in Lehmann (1966). Also, $L(\{i\},i,\theta)$ satisfies the monotonicity property (5) of Lehmann (1966). Thus by his main result the first sum in (1) is for $(\tilde S,d)$ smaller than or equal to that one for $(S,d)$. Since all other terms in (1) are the same for both procedures the proof is completed.

The corresponding proof with respect to final decisions at Stage 2 uses basically the same idea, but the analysis turns out to be a bit more complicated. For simplicity, let us assume from now on that the mapping $(u,v) \mapsto (u,T(u,v))$ for $(u,v)$ in the interior of $\mathcal{D} = \bigcup_{\theta\in\Omega} (\mathrm{support}(f_\theta) \times \mathrm{support}(g_\theta))$ is one-to-one and continuously differentiable. Thus we have a function $\tilde T$ with $(u,v) = (u,\tilde T(u,T(u,v)))$, $(u,v) \in \mathcal{D}$, with analogous properties.

Lemma 2.  For every $\theta \in \Omega^k$ and every permutation symmetric Borel set $A \subseteq \mathbb{R}^k$ with $P_\theta\{U \in A\} > 0$, the conditional distribution of $W$, given $U \in A$, has a density w.r.t. the Lebesgue measure of the type

$$
c(\theta) \prod_{i=1}^{k} h_{\theta_i}(w_i)\, p(w), \qquad w \in \mathbb{R}^k,
\tag{3}
$$

where $c: \Omega^k \to \mathbb{R}_+ = \{c \mid c \ge 0\}$ is permutation invariant, $h_\theta: \mathbb{R} \to \mathbb{R}_+$ is measurable, $p: \mathbb{R}^k \to \mathbb{R}_+$ is measurable and permutation invariant, and $\{h_\theta\}_{\theta\in\Omega}$ has MLR.


Proof:  Let $\theta \in \Omega^k$ and $A \subseteq \mathbb{R}^k$ be given as stated above. Then the conditional distribution of $(U,V)$, given $U \in A$, has the following density w.r.t. Lebesgue measure.

$$
P_\theta\{U \in A\}^{-1} \prod_{i=1}^{k} f_{\theta_i}(u_i)\, g_{\theta_i}(v_i)\, I_A(u), \qquad u,v \in \mathbb{R}^k.
\tag{4}
$$

Since $W_i = T(U_i,V_i)$ is sufficient for $\theta_i$, $i = 1,\dots,k$, by the factorization theorem there exist non-negative measurable functions $h_\theta$ and $G$ with $f_\theta(u) g_\theta(v) = h_\theta(T(u,v))\, G(u,v)$, $u,v \in \mathbb{R}$, $\theta \in \Omega$. After inserting this into (4) and after a standard change of variables, we see that the conditional distribution of $(U,W)$ with $W_i = T(U_i,V_i)$, $i = 1,\dots,k$, given $U \in A$, has the density

$$
P_\theta\{U \in A\}^{-1} \prod_{i=1}^{k} h_{\theta_i}(w_i)\, G(u_i,\tilde T(u_i,w_i)) \left| \frac{\partial \tilde T(u_i,w_i)}{\partial w_i} \right| I_A(u), \qquad u,w \in \mathbb{R}^k.
\tag{5}
$$

Thus by integrating out the variables $u \in \mathbb{R}^k$ we see that the conditional distribution of $W$, given $U \in A$, has a density of the form (3) w.r.t. the Lebesgue measure. Here $c(\theta) = P_\theta\{U \in A\}^{-1}$ which is, as we know already, permutation invariant. Moreover,

$$
p(w) = \int_A \prod_{i=1}^{k} G(u_i,\tilde T(u_i,w_i)) \left| \frac{\partial \tilde T(u_i,w_i)}{\partial w_i} \right| du, \qquad w \in \mathbb{R}^k,
\tag{6}
$$

which likewise is permutation invariant. Finally, since by assumption the family of densities for $W_i = T(U_i,V_i)$, $i = 1,\dots,k$, has MLR, $\{h_\theta\}_{\theta\in\Omega}$ also has MLR. This completes the proof of Lemma 2.
Corollary 1.  For every $\theta \in \Omega^k$, permutation symmetric Borel set $A \subseteq \mathbb{R}^k$ with $P_\theta\{U \in A\} > 0$, and $s = \{i_1,\dots,i_t\} \subseteq \{1,\dots,k\}$, the conditional distribution of $(W_{i_1},\dots,W_{i_t})$, given $U \in A$, has a density w.r.t. the Lebesgue measure of the type

$$
c(\theta) \prod_{j=1}^{t} h_{\theta_{i_j}}(\xi_j)\, p_{\theta'}(\xi), \qquad \xi \in \mathbb{R}^t,
\tag{7}
$$

where $c$ and $\{h_\theta\}_{\theta\in\Omega}$ are the same as in (3), $\theta' = (\theta_{j_1},\dots,\theta_{j_{k-t}})$ with $\{i_1,\dots,i_t\} \cup \{j_1,\dots,j_{k-t}\} = \{1,\dots,k\}$, and $p_{\theta'}(\xi)$ is permutation invariant in $\theta'$ as well as in $\xi \in \mathbb{R}^t$.

Proof:  This follows from Lemma 2 by integrating out in (3) the variables $w_{j_1},\dots,w_{j_{k-t}}$. Especially thereby we get for $(w_{i_1},\dots,w_{i_t}) \in \mathbb{R}^t$

$$
p_{\theta'}(w_{i_1},\dots,w_{i_t}) = \int_{\mathbb{R}^{k-t}} \prod_{r=1}^{k-t} h_{\theta_{j_r}}(w_{j_r})\, p(w_{i_1},\dots,w_{i_t},w_{j_1},\dots,w_{j_{k-t}})\, d(w_{j_1},\dots,w_{j_{k-t}}),
\tag{8}
$$

which now can be seen to have the invariance properties stated above. Thus the proof is completed.


Now  we  are  ready  to  prove  the  main  result  of  this  section. 


Theorem 1.  Let $(S,d)$ be a 2-stage procedure and let $\tilde S$ be the modification of $S$ (given in Lemma 1) which uses the optimal final decision at Stage 1. Let $d^* = \{d^*_s\}_{s \subseteq \{1,\dots,k\}}$ be the set of natural final decisions, i.e. where for every $s \subseteq \{1,\dots,k\}$ with $|s| \ge 2$, $i \in s$, $u,v \in \mathbb{R}^k$, $d^*_s(u,v) = i$ implies $T(u_i,v_i) = \max_{j \in s} T(u_j,v_j)$. Then $r_\theta(\tilde S,d^*) \le r_\theta(S,d)$ for all $\theta \in \Omega^k$.

Proof:  Let $\theta \in \Omega^k$ be fixed. In view of Lemma 1 we only have to show that $r_\theta(\tilde S,d^*) \le r_\theta(\tilde S,d)$. Thus by (1) it suffices to prove that for every $s \subseteq \{1,\dots,k\}$ with $|s| \ge 2$ and $P_\theta\{S(U) = s\} > 0$,

$$
\sum_{i \in s} L(s,i,\theta)\, P_\theta\{d^*_s(U,V) = i \mid S(U) = s\} \le \sum_{i \in s} L(s,i,\theta)\, P_\theta\{d_s(U,V) = i \mid S(U) = s\}.
\tag{9}
$$

Let $s \subseteq \{1,\dots,k\}$ with $|s| \ge 2$ be fixed. Let $A = \{u \in \mathbb{R}^k \mid S(u) = s\}$. By the invariance property of $S$, $A$ is permutation symmetric. In the conditional situation, given $S(U) = s$ or, equivalently, given $U \in A$, $W$ is sufficient for $\theta \in \Omega^k$. This can be seen from (4) and the sentence following (4). Thus, as one concludes in the theory of selection procedures, if $s = \{i_1,\dots,i_t\}$ with $1 \le i_1 < \dots < i_t \le k$, say, then we can assume that $d_s(u,v)$ is a function of $(T(u_{i_1},v_{i_1}),\dots,T(u_{i_t},v_{i_t}))$. By Corollary 1 the conditional distribution of $(W_{i_1},\dots,W_{i_t})$, given $U \in A$, has a density w.r.t. the Lebesgue measure of the form (7) or, respectively,

$$
c_{\theta'}(\theta_{i_1},\dots,\theta_{i_t}) \prod_{j=1}^{t} h_{\theta_{i_j}}(\xi_j)\, p_{\theta'}(\xi), \qquad \xi \in \mathbb{R}^t,
\tag{10}
$$

where $c_{\theta'}: \Omega^t \to \mathbb{R}_+$ and $p_{\theta'}: \mathbb{R}^t \to \mathbb{R}_+$ are permutation invariant functions, $p_{\theta'}$ is measurable and $\{h_\theta\}_{\theta\in\Omega}$ has MLR. Therefore this distribution satisfies all conditions assumed by Lehmann (1966). Since moreover, $L(s,i,\theta)$, $i \in s$, satisfies the condition (5) in his paper, it follows from his version of the "Bahadur-Goodman Theorem" that inequality (9) holds. This completes the proof of the theorem.


Remark 1.  Let $(S,d)$ be any 2-stage procedure. Then, after having made a decision $S = s$, say, the final decision $d_s = i$, say, can be viewed as being a partition $(s \setminus \{i\}, \{i\})$ of $s$ into two subsets of sizes $|s|-1$ and 1, respectively. If, more generally, partitions into $q$ subsets of $s$ of fixed sizes $r_1,\dots,r_q$ are to be made, where $q, r_1,\dots,r_q$ depend on $|s|$, then the more general version of the "Bahadur-Goodman-Theorem" by Eaton (1967) can be applied. Thus, if the loss structure in this setting is compatible with the one assumed by Eaton (1967), then the set of natural partitions in terms of the ordered $W_i$'s is optimal.

By Theorem 1 we know now especially that after having made a decision $S(U) = s$, say, it is always better to make a final decision in terms of the largest $W_i$ among the $W_i$ with $i \in s$, than to make it in terms of the largest $V_i$ among the $V_i$ with $i \in s$. This fact appears to be interesting enough to be formulated in a slightly more general form in the following Corollary 2.

Corollary 2.  For every $\theta \in \Omega^k$, $s \subseteq \{1,\dots,k\}$ and every permutation symmetric Borel set $A \subseteq \mathbb{R}^k$ with $P_\theta\{U \in A\} > 0$ the following holds true. Let $e_i = P_\theta\{W_i = \max_{j \in s} W_j \mid U \in A\}$ and $f_i = P_\theta\{V_i = \max_{j \in s} V_j\}$, $i \in s$. Then the $e_i$'s and $f_i$'s are ordered in the same order as the $\theta_i$'s with $i \in s$ and, moreover, the vector of $e_i$'s majorizes the vector of $f_i$'s.

Proof:  Without loss of generality, let $s = \{1,\dots,t\}$ with $t \ge 2$ and $\theta \in \Omega^k$ with $\theta_1 \le \dots \le \theta_t$. If $A \subseteq \mathbb{R}^k$ has the properties stated above, take any permutation invariant $S$ with $S(u) = s$ if $u \in A$ and with $|S(u)| \le 1$ otherwise. Let $r \in \{1,\dots,t-1\}$ be fixed and take any loss structure $L$ with $L(s,i,\theta) = 1\ (0)$ if $i \le (>)\ r$, $i \in s$. Let $d_s(u,v) = i$ if $v_i = \max_{j \in s} v_j$, $i \in s$, where ties are broken at random. Then by Theorem 1 we get $r_\theta(S,d^*) \le r_\theta(S,d)$ or, more specifically, by inequality (9) we get $f_{r+1} + \dots + f_t \le e_{r+1} + \dots + e_t$, since, obviously, we have $f_1 + \dots + f_t = e_1 + \dots + e_t = 1$. Moreover, that $f_1 \le \dots \le f_t$ holds is well known. Finally, $e_1 \le \dots \le e_t$ follows from Corollary 1 and Lemma 4.1 of Eaton. Thus the proof is completed.
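The majorization statement of Corollary 2 compares partial sums of the two probability vectors after sorting them in decreasing order. The following Python helper is a minimal sketch of that check; the function name `majorizes` and the numerical vectors in the usage lines are made up for illustration and are not results from the paper.

```python
from typing import Sequence


def majorizes(e: Sequence[float], f: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if the vector e majorizes the vector f.

    Both vectors must have the same length and the same total; every partial
    sum of the decreasingly sorted e must be at least the corresponding
    partial sum of the decreasingly sorted f (up to the tolerance tol).
    """
    if len(e) != len(f) or abs(sum(e) - sum(f)) > tol:
        return False
    partial_e = 0.0
    partial_f = 0.0
    for x, y in zip(sorted(e, reverse=True), sorted(f, reverse=True)):
        partial_e += x
        partial_f += y
        if partial_e < partial_f - tol:
            return False
    return True


# Hypothetical selection probabilities for t = 3 populations.
e_probs = [0.1, 0.2, 0.7]   # decisions based on the W_i
f_probs = [0.2, 0.3, 0.5]   # decisions based on the V_i
e_majorizes_f = majorizes(e_probs, f_probs)
```

In the corollary both vectors are already ordered like the $\theta_i$, so the sorting step only matters when the helper is applied to unordered input.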

Example (Normal Case):  Let us look at the special case where $\pi_1,\dots,\pi_k$ are normal populations $N(\theta_1,\sigma^2),\dots,N(\theta_k,\sigma^2)$ with unknown means $\theta_1,\dots,\theta_k \in \mathbb{R}$ and a common known variance $\sigma^2 > 0$. Let $U_i$ and $V_i$ be the arithmetic means of the observations in samples $\{X_{ij}\}_{j=1,\dots,n}$ and $\{Y_{ij}\}_{j=1,\dots,m}$, respectively, $i = 1,\dots,k$. In several parts of this paper we shall return to this special case which henceforth will be denoted as the normal case.

Thus we have $U_i \sim N(\theta_i,p)$ and $V_i \sim N(\theta_i,q)$, $i = 1,\dots,k$, which are mutually independent, where $p = \sigma^2/n$ and $q = \sigma^2/m$. Let $W_i$ be the overall arithmetic mean for $\pi_i$, $i = 1,\dots,k$. Then $W_i = T(U_i,V_i) = (qU_i + pV_i)/(q+p) \sim N(\theta_i,(q^{-1}+p^{-1})^{-1})$, where $(q^{-1}+p^{-1})^{-1} = \sigma^2/(n+m)$, and $V_i = \tilde T(U_i,W_i) = W_i + p^{-1}q(W_i - U_i)$, $i = 1,\dots,k$.

Since all our assumptions concerning the underlying distributions are satisfied, all results derived so far are valid in this case. And from Corollary 2, one can derive interesting inequalities.
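The maps $T$ and $\tilde T$ of the normal case are simple enough to check numerically. The Python sketch below uses the hypothetical function names `pooled_statistic` and `recover_v` and made-up sample sizes; it computes the pooled mean, its variance $\sigma^2/(n+m)$, and confirms that the inverse map recovers the second-stage mean.

```python
from typing import Tuple


def pooled_statistic(u: float, v: float, n: int, m: int, sigma2: float) -> Tuple[float, float]:
    """Return (w, var_w) with w = T(u, v) = (q*u + p*v)/(q + p) in the normal case."""
    p = sigma2 / n                     # variance of the first-stage mean
    q = sigma2 / m                     # variance of the second-stage mean
    w = (q * u + p * v) / (q + p)      # equals the overall mean (n*u + m*v)/(n + m)
    var_w = 1.0 / (1.0 / p + 1.0 / q)  # equals sigma2 / (n + m)
    return w, var_w


def recover_v(u: float, w: float, n: int, m: int, sigma2: float) -> float:
    """Inverse map: v = w + (q/p) * (w - u)."""
    p = sigma2 / n
    q = sigma2 / m
    return w + (q / p) * (w - u)


# Made-up values: first-stage mean 1.0 from n = 4, second-stage mean 2.0 from m = 6.
w_example, var_example = pooled_statistic(1.0, 2.0, n=4, m=6, sigma2=3.0)
v_back = recover_v(1.0, w_example, n=4, m=6, sigma2=3.0)
```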

Remark  2.  Without  going  into  details  it  should  be  pointed  out  that 
analogous  results  to  the  ones  derived  in  this  section  can  be  obtained  in 
more  general  sequential  settings,  provided  that  the  stopping  rule  is 
permutation  invariant. 
3.  A natural type 2-stage procedure.  In this section we will study 2-stage procedures $(S,d)$ from a non-decision theoretic point of view. Let a correct decision (CD) at $\theta \in \Omega^k$ be $d = 0$ (i.e. $S = \emptyset$) if $\theta_1,\dots,\theta_k \le \theta_0$, and be $d = i$ if $\theta_i = \max_{j=1,\dots,k} \theta_j > \theta_0$, otherwise. Let us assume that the experimenter wishes to have a procedure $(S,d)$ which at Stage 1 has a small expected number of selected bad populations, denoted by $E_\theta(N_b)$ (a small expected overall sampling amount or a small similar measure of economical performance), and a large probability of a correct selection $P_\theta(\mathrm{CD})$ at points $\theta \in \Omega^k$ where $\max_{j=1,\dots,k} \theta_j > \theta_0$, subject to the basic P*-condition $\inf\{P_\theta(\mathrm{CD}) \mid \theta \in \Omega^k,\ \theta_1,\dots,\theta_k \le \theta_0\} \ge P^*$, where $P^*$ is a pre-specified constant with $0 < P^* < 1$.

The following procedure may sometimes be applied in practice. The experimenter takes the UMP test for $H: \theta \le \theta_0$ versus $K: \theta > \theta_0$ at level $\alpha = 1 - (P^*)^{1/k}$ and selects all populations $\pi_i$ which are shown to be significantly good by the statistics $U_i$, $i = 1,\dots,k$. His final decision may be the natural one based on the $V_i$'s associated with the populations which are selected at Stage 1. From Corollary 2 it follows that this procedure can be improved with respect to $P_\theta(\mathrm{CD})$ without any changes in the expected number of selected good populations $E_\theta(N_g)$, $E_\theta(N_b)$ and $P_\theta\{S(U) = \emptyset\}$. The improved procedure $\mathcal{P}$ will be studied now in more detail. For convenience, let us define it without using the terminology of hypothesis testing.

Definition 3.  Procedure $\mathcal{P}$.  Let $\mathcal{P}$ be the 2-stage procedure $(S,d^*)$ with $S(u) = \{i \mid u_i > a,\ i = 1,\dots,k\}$, where $a \in \mathbb{R}$ is determined by $P_{\theta_0}\{U_i \le a\}^k = P^*$.

That $\mathcal{P}$ satisfies the basic P*-condition follows from the fact that $U_i$ is stochastically non-decreasing in $\theta_i \in \Omega$, $i = 1,\dots,k$, which in turn is a well-known consequence of the MLR property of $\{f_\theta\}_{\theta\in\Omega}$.
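In the normal case the cut-off $a$ of Definition 3 has a closed form, since $U_i \sim N(\theta_0, \sigma^2/n)$ at $\theta_i = \theta_0$ and the defining equation reduces to a normal quantile at probability $(P^*)^{1/k}$. The Python sketch below assumes exactly that normal setting; the function names `screening_threshold` and `screen` and the numerical inputs are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence, Set


def screening_threshold(theta0: float, sigma2: float, n: int, k: int, p_star: float) -> float:
    """Solve P_{theta0}{U_i <= a}^k = P* for a when U_i ~ N(theta0, sigma2/n)."""
    sd = (sigma2 / n) ** 0.5
    return NormalDist(mu=theta0, sigma=sd).inv_cdf(p_star ** (1.0 / k))


def screen(u: Sequence[float], a: float) -> Set[int]:
    """First-stage selection S(u) = {i : u_i > a}, with 1-based indices."""
    return {i + 1 for i, ui in enumerate(u) if ui > a}


# Made-up design: k = 3 populations, n = 16 observations each, sigma^2 = 4, P* = 0.9.
a_example = screening_threshold(theta0=0.0, sigma2=4.0, n=16, k=3, p_star=0.9)
selected = screen([-0.2, 0.4, 1.5], a_example)
```

With these inputs the threshold lies a little below one, so only the third population passes the screen and the procedure would stop at Stage 1.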
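In the next two steps we establish formulas for the distribution of final decisions under $\mathcal{P}$ and derive a basic monotonicity property.

Theorem 2.  For every $\theta \in \Omega^k$,

$$
P_\theta\{d^*_{S(U)}(U,V) = i\} =
\begin{cases}
\displaystyle \prod_{j=1}^{k} P_{\theta_j}\{U_j \le a\} = \prod_{j=1}^{k} F_{\theta_j}(-\infty), & i = 0, \\[2ex]
\displaystyle \int_{\mathbb{R}} \prod_{j=1,\, j \ne i}^{k} F_{\theta_j}(x)\, dF_{\theta_i}(x), & i = 1,\dots,k,
\end{cases}
\tag{11}
$$

where for $\theta_r \in \Omega$, $x \in \mathbb{R} \cup \{-\infty\}$, $r = 1,\dots,k$,

$$
F_{\theta_r}(x) = E_{\theta_r}\big[ I_{(-\infty,a]}(U_r) + (1 - I_{(-\infty,a]}(U_r))\, I_{(-\infty,x]}(W_r) \big].
\tag{12}
$$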

Proof:  For $r \in \{1,\dots,k\}$ take the improper random variable $Z_r$ defined by $Z_r = -\infty\ (W_r)$ if $U_r \le (>)\ a$, which obviously has the distribution function $F_{\theta_r}(x)$, $x \in \mathbb{R} \cup \{-\infty\}$. Now, for $i = 1,\dots,k$ we have

$$
\begin{aligned}
\{Z_i = \max_{j=1,\dots,k} Z_j \text{ and } Z_i > -\infty\}
&= \{W_i = \max\{W_j \mid U_j > a,\ j = 1,\dots,k\} \text{ and } U_i > a\} \\
&= \{d^*_{S(U)}(U,V) = i\}.
\end{aligned}
\tag{13}
$$

Therefore, in view of the independence of $Z_1,\dots,Z_k$, (11) follows for $i = 1,\dots,k$. The proof of (11) for $i = 0$ is straightforward.

Theorem 3.  $\{F_\theta\}_{\theta\in\Omega}$, as given in (12), is a stochastically non-decreasing family of distribution functions on $\mathbb{R} \cup \{-\infty\}$.

Proof:  Let $x \in \mathbb{R} \cup \{-\infty\}$ be fixed and let $H_{a,x}$ be an auxiliary function defined on the domain $\mathcal{D}$ of $T$ introduced in Section 2 by

$$
H_{a,x}(u,v) = (1 - I_{(-\infty,a]}(u))(1 - I_{(-\infty,x]}(T(u,v))), \qquad (u,v) \in \mathcal{D}.
\tag{14}
$$

By the assumptions made in Section 1 we can assume that $T(u,v)$ is a non-decreasing function in $u$ as well as in $v$, $(u,v) \in \mathcal{D}$. Thus $H_{a,x}$ has the same monotonicity properties. Now by $W_1 = T(U_1,V_1)$ and (12) we have

$$
\begin{aligned}
F_{\theta_1}(x) &= 1 - E_{\theta_1}\big[ (1 - I_{(-\infty,a]}(U_1))(1 - I_{(-\infty,x]}(W_1)) \big] \\
&= 1 - E_{\theta_1}\big[ H_{a,x}(U_1,V_1) \big], \qquad \theta_1 \in \Omega.
\end{aligned}
\tag{15}
$$

Since $U_1$ and $V_1$ are stochastically non-decreasing in $\theta_1 \in \Omega$ and independent, $E_{\theta_1}[H_{a,x}(U_1,V_1)]$ is non-decreasing in $\theta_1 \in \Omega$. This follows from Lehmann (1955). Thus the proof is completed.


From Theorems 2 and 3 several desirable properties of procedure $\mathcal{P}$ can be derived. Properties 1-4 can be proved with standard techniques (especially integration by parts) from single stage selection theory. The masses in $\{-\infty\}$ have to be taken into consideration, but they cause no serious problems. Thus, we omit the proofs for brevity.


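Property 1:  For every $i \in \{1,\dots,k\}$, $P_\theta\{d^*_{S(U)}(U,V) = i\}$ is non-decreasing in $\theta_i$ and non-increasing in $\theta_j$, $j \ne i$, $\theta \in \Omega^k$.

Property 2:  For every $\theta \in \Omega^k$ with $\theta_1 \le \dots \le \theta_k$, $P_\theta\{d^*_{S(U)}(U,V) = i\}$ is non-decreasing in $i \in \{1,\dots,k\}$.

Property 3:  For every non-empty set $M \subseteq \{1,\dots,k\}$, $P_\theta\{d^*_{S(U)}(U,V) \in M\}$ is non-decreasing (non-increasing) in $\theta_i$ with $i \in M$ ($i \notin M$), $\theta \in \Omega^k$.

Property 4:  $E_\theta(N_g)$ ($E_\theta(N_b)$) is non-decreasing (non-increasing) in $\theta_i$ with $\theta_i > \theta_0$ ($\theta_i \le \theta_0$), $i = 1,\dots,k$, $\theta \in \Omega^k$.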
Property 5:  For every $\theta \in \Omega^k$ with $\theta_1,\dots,\theta_k < \theta_0$, $P_\theta\{S(U) = \emptyset\}$ tends to 1 for large $n$. For every $\theta \in \Omega^k$ with $\theta_1,\dots,\theta_{k-t} < \theta_0 < \theta_{k-t+1},\dots,\theta_{k-1} < \theta_k$, $t \in \{1,\dots,k\}$, $P_\theta\{S(U) = \{k-t+1,\dots,k\},\ d^*_{S(U)}(U,V) = k\}$ tends to 1 for large $n$ and $m$.

Proof:  The first assertion follows from the well known consistency properties of the UMP test mentioned at the beginning of this section. For $\theta \in \Omega^k$ with $\theta_1,\dots,\theta_{k-t} < \theta_0 < \theta_{k-t+1},\dots,\theta_{k-1} < \theta_k$, $t \in \{1,\dots,k\}$, by the same reasons, $P_\theta\{S(U) = \{k-t+1,\dots,k\}\}$ tends to 1 for large $n$. Since, moreover, $P_\theta\{W_k = \max_{j=1,\dots,k} W_j\}$ tends to 1 for large $n+m$ (see Miescke (1979a)), the proof is completed.


Next we would like to show along the lines of Tamhane and Bechhofer, in an informal way of proof, that procedure $\mathcal{P}$ is preferable to the corresponding 1-stage procedure $\mathcal{P}_0$, say, from an economical point of view. Let $\tilde U_1,\dots,\tilde U_k$ be of the same type as $U_1,\dots,U_k$, but based on samples of size $n_0$ from $\pi_1,\dots,\pi_k$. Then $\mathcal{P}_0$ decides as follows:

$$
\mathcal{P}_0 \text{ decides }
\begin{cases}
0 & \text{if } \tilde U_1,\dots,\tilde U_k \le a_0, \\
i & \text{if } \tilde U_i = \max_{j=1,\dots,k} \tilde U_j \text{ and } \tilde U_i > a_0, \quad i = 1,\dots,k,
\end{cases}
\tag{16}
$$

where $a_0$ is determined by $P_{\theta_0}\{\tilde U_i \le a_0\}^k = P^*$. The version of $\mathcal{P}_0$ in the normal case was studied by Bechhofer and Turnbull (1978). Now, if an optimal allocation of observations is derived subject to a criterion which can be met by the use of monotonicity properties of $P_\theta\{\text{final decision is "}i\text{"}\}$ in $\theta_1,\dots,\theta_k$, $i = 1,\dots,k$, then the allocation problem has to be solved for both $\mathcal{P}$ and $\mathcal{P}_0$ at the same points $\theta \in \Omega^k$. Since then $\mathcal{P}_0$ can be viewed to be a special version of $\mathcal{P}$ with $n = n_0$ and $m = 0$, we conclude as follows:
Property 6:  In every allocation problem subject to a criterion which can be met by the use of monotonicity properties of $P_\theta\{\text{final decision is "}i\text{"}\}$ in $\theta_1,\dots,\theta_k$, $\theta \in \Omega^k$, $i = 1,\dots,k$, $\mathcal{P}$ is at least as economical as $\mathcal{P}_0$.

Finally, let us consider the class $\mathcal{C}$ of procedures which are of the same type as $\mathcal{P}$ but use another level $\alpha$ test at Stage 1. Then by the properties of the UMP level $\alpha$ test we get

Property 7:  For every fixed $n$, $m$ and $\alpha$ (or $P^*$, respectively), $\mathcal{P}$ maximizes (minimizes) $E_\theta(N_g)$ ($E_\theta(N_b)$) within the class $\mathcal{C}$, uniformly in $\theta \in \Omega^k$.

To  summarize  the  results  so  far  derived,  and  especially  in  view  of 
Properties  4,  5  and  7,  p  appears  to  be  a  reasonable  procedure  if  the 
experimenter  wishes  to  screen  out  the  bad  populations  at  Stage  1,  to  keep 
the  good  ones  (if  there  are  any)  at  the  same  time,  and  finally  to  select 
the  best  population  (if  it  is  good). 

On the other hand, let us look at the case where the experimenter is looking for the best population (if it is good) but wishes to keep the expected overall sampling amount small. Then at points $\theta \in \Omega^k$ where more than one population is good, $\mathcal{P}$ might not screen out very effectively. Here an additional screening mechanism seems to be appropriate, i.e. a subset selection procedure for the first stage, which has to be combined with a procedure of the type $S$ considered so far.

In the normal case, analogous to what Tamhane and Bechhofer (1979) proposed in the non-control setting, a natural choice for the additional screening mechanism could be Gupta's (1956) maximum means procedure. This leads to a 2-stage procedure $\mathcal{P}_\Delta = (S_\Delta,d^*)$ with

$$
S_\Delta(u) = \{ i \mid u_i > a_\Delta \text{ and } u_i \ge \max_{j=1,\dots,k} u_j - p^{1/2}\Delta,\ i = 1,\dots,k \},
\tag{17}
$$

where $\Delta \ge 0$ is fixed and $a_\Delta$ has to be determined such that $S_\Delta$ meets the basic P*-condition. Note that for $\Delta = 0\ (\infty)$, $\mathcal{P}_\Delta$ is of the type $\mathcal{P}_0$ ($\mathcal{P}$).

Since we again have enlarged the class of 2-stage procedures, we are led to a more economical type of procedure in the sense of Property 6. Moreover, for $\Delta \ge 0$ and $\theta \in \Omega^k$ with $\theta_0 < \max_{j=1,\dots,k} \theta_j$, the probability of making a correct final decision at Stage 1 already tends to 1 for large $n$. But, on the other hand, $\mathcal{P}_\Delta$ for $0 < \Delta < \infty$ is much more difficult to implement in practice. The problems arising here are of the same type as discussed in Tamhane and Bechhofer (1979), Gupta and Miescke (1980) and Miescke and Sehr (1980).
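The combined screening rule (17) is easy to state in code once the cut-off is known. The Python sketch below takes $a_\Delta$ as a given input, because its determination from the P*-condition is exactly the hard part mentioned above and is not attempted here; the function name `screen_delta` and all numbers are hypothetical.

```python
import math
from typing import Sequence, Set


def screen_delta(u: Sequence[float], a_delta: float, delta: float, p: float) -> Set[int]:
    """First-stage rule (17): keep i if u_i > a_delta and u_i >= max_j u_j - sqrt(p)*delta.

    p is the variance sigma^2/n of a first-stage mean; a_delta is assumed given.
    """
    cutoff = max(u) - math.sqrt(p) * delta
    return {i + 1 for i, ui in enumerate(u) if ui > a_delta and ui >= cutoff}


# Made-up first-stage means with p = 0.25 and a_delta = 0.5.
u_example = [0.6, 1.0, 1.4]
only_max = screen_delta(u_example, a_delta=0.5, delta=0.0, p=0.25)          # Delta = 0
in_between = screen_delta(u_example, a_delta=0.5, delta=1.0, p=0.25)        # 0 < Delta < inf
control_only = screen_delta(u_example, a_delta=0.5, delta=math.inf, p=0.25) # Delta = inf
```

The two boundary cases reproduce the remark following (17): with $\Delta = 0$ at most the population with the largest mean survives, and with $\Delta = \infty$ only the comparison with the cut-off remains.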
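4.  Bayesian 2-stage procedures.  From now on let us assume that the parameters $\theta = (\theta_1,\dots,\theta_k)$ vary randomly according to a permutation invariant prior distribution $\tau$ on the Borel sets of $\Omega^k$. We will study the form of Bayesian 2-stage procedures under a loss structure $L$ given by

$$
L(s,i,\theta) =
\begin{cases}
0 & \text{if } s = \emptyset, \\
\ell(\theta_0 - \theta_i) & \text{if } s = \{i\}, \\
c|s| + \ell(\theta_0 - \theta_i) & \text{if } |s| \ge 2,
\end{cases}
\tag{18}
$$

$i = 1,\dots,k$, $\theta \in \Omega^k$, where $c > 0$ is a constant and $\ell: \mathbb{R} \to \mathbb{R}$ is non-decreasing, integrable, with $\ell(0) = 0$, so that $L$ satisfies property (b) of Definition 2. The overall Bayes risk is given by

$$
\int_{\Omega^k} \Big[ \sum_{i=1}^{k} \ell(\theta_0 - \theta_i)\, P_\theta\{S(U) = \{i\}\}
+ \sum_{s,\,|s| \ge 2} \Big( c|s| + \sum_{i \in s} \ell(\theta_0 - \theta_i)\, P_\theta\{d_s(U,V) = i \mid S(U) = s\} \Big) P_\theta\{S(U) = s\} \Big]\, d\tau(\theta).
\tag{19}
$$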
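For orientation, the three cases of the loss structure (18) can be written as a small Python function. This is only a sketch: the name `loss` is hypothetical, the function $\ell$ is passed in by the caller, and the particular $\ell$ and parameter values in the usage lines are arbitrary placeholders rather than choices made in the paper.

```python
from typing import Callable, Sequence, Set


def loss(
    s: Set[int],
    i: int,
    theta: Sequence[float],
    theta0: float,
    c: float,
    ell: Callable[[float], float],
) -> float:
    """Loss (18) for first-stage subset s and final decision i (1-based, 0 if s is empty)."""
    if len(s) == 0:
        return 0.0                                   # nothing selected, no loss
    decision_loss = ell(theta0 - theta[i - 1])       # loss for deciding on population i
    if len(s) == 1:
        return decision_loss                         # stopped at Stage 1, no sampling cost
    return c * len(s) + decision_loss                # cost c for each population kept at Stage 2


# Placeholder inputs: two populations, control value 1.0, cost c = 0.1, ell(x) = x.
theta_example = [0.0, 1.5]
loss_empty = loss(set(), 0, theta_example, 1.0, 0.1, lambda x: x)
loss_single = loss({2}, 2, theta_example, 1.0, 0.1, lambda x: x)
loss_pair = loss({1, 2}, 1, theta_example, 1.0, 0.1, lambda x: x)
```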

By Theorem 1 we can restrict our consideration to Bayes procedures $(S^B,d^B)$ with $d^B = d^*$ and the property that $u \in \mathbb{R}^k$ and $S^B(u) = \{i\}$ implies $u_i = \max_{j=1,\dots,k} u_j$, $i = 1,\dots,k$. Therefore at every point $u \in \mathbb{R}^k$ an optimal subset selection procedure $S^B$ decides in favor of a subset $s \subseteq \{1,\dots,k\}$ which is associated with the smallest of the values given in the following scheme.


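Subset $s$                                    Posterior risk at $u$ (in the support of the first-stage densities)

$s = \emptyset$                               $0$

$s = \{i\}$                                   $E\{\ell(\theta_0 - \theta_i) \mid U = u\}$,  $u_i = \max_{j=1,\dots,k} u_j$

$s = \{i_1,\dots,i_t\}$                       $tc + E\{\min_{j=1,\dots,t} E\{\ell(\theta_0 - \theta_{i_j}) \mid U = u, V\} \mid U = u\}$,  $1 \le i_1 < \dots < i_t \le k$, $t \ge 2$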

Note  that  in  the  last  expression  the  inner  conditional  expectation  is 
viewed  as  being  a  function  of  V,  and  that  the  outer  one  is  the  expectation 
with  respect  to  the  conditional  distribution  of  V,  given  U=u. 

Definition 4.  A 2-stage procedure $(S,d)$ is called monotone if $(S,d) = (\tilde S,d^*)$ in the sense of Theorem 1 and, moreover, if for every $u \in \mathbb{R}^k$, $i,j \in \{1,\dots,k\}$ with $u_i < u_j$, $i \in S(u)$ implies $j \in S(u)$.
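The selection part of Definition 4 says that a selected population can never be passed over in favor of one with a smaller first-stage statistic. A minimal Python check of this condition for a single observed $u$ and a single candidate subset might look as follows; the function name `is_monotone_selection` and the example values are hypothetical, and the check covers only the subset condition, not the requirement that the final decisions are the natural ones.

```python
from typing import Sequence, Set


def is_monotone_selection(u: Sequence[float], s: Set[int]) -> bool:
    """True if, for all i in s and all j with u_i < u_j, j is also in s (1-based indices)."""
    for i in s:
        for j in range(1, len(u) + 1):
            if u[i - 1] < u[j - 1] and j not in s:
                return False
    return True


# Made-up first-stage statistics for k = 3 populations.
u_values = [0.3, 1.1, 0.7]
top_two_ok = is_monotone_selection(u_values, {2, 3})   # the two largest are selected
skips_third = is_monotone_selection(u_values, {1, 2})  # population 3 beats 1 but is left out
```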

Next we wish to find sufficient conditions under which there exist
Bayes 2-stage procedures which are monotone. For this purpose let $u \in \mathbb{R}^k$
with $u_1 \le \dots \le u_k$ and $t \in \{2,\dots,k-1\}$ be fixed. In Goel and Rubin (1977)
an optimal $s$ with $|s| = t$ could be derived directly from Eaton's result.
In our situation this is not possible since now the conditional expected
loss, given $U = u$, does not simply depend on $S(u)$, but also on $u$. Let us now
try to find sufficient conditions under which the posterior risk at $u$ is
minimal for the set $\{k-t+1,\dots,k\}$ among all sets $s \subseteq \{1,\dots,k\}$ with $|s| = t$.

An optimal $s$ with $|s| = t$ minimizes

$$E\{\min_{j\in s} E\{\ell(\theta_0-\theta_j) \mid U=u, V\} \mid U=u\} = \int_{\mathbb{R}^k} \min_{j\in s} \int_{\Omega^k} \ell(\theta_0-\theta_j) \prod_{i=1}^{k} f_{\theta_i}(u_i)\, g_{\theta_i}(v_i)\, d\tau(\theta)\, dv \; e(u), \tag{20}$$

where $e(u) = \big(\int_{\Omega^k} \prod_{i=1}^{k} f_{\theta_i}(u_i)\, d\tau(\theta)\big)^{-1}$ is of no relevance for our problem
and thus can be ignored in the sequel. From the remark following (4) we
see that the integral on the r.h.s. of (20) can be rewritten as
A change of variables $w_i = T(u_i,v_i)$ (or $v_i = \tilde T(u_i,w_i)$, respectively)
modifies (21) to

$$\int_{\mathbb{R}^k} \min_{j\in s} \int_{\Omega^k} \ell(\theta_0-\theta_j) \prod_{i=1}^{k} h_{\theta_i}(w_i)\, d\tau(\theta) \prod_{r=1}^{k} G(u_r,\tilde T(u_r,w_r))\, \frac{\partial \tilde T(u_r,w_r)}{\partial w_r}\, dw. \tag{22}$$


Now we are in position to apply Eaton's main result iteratively, first
to the inner integral (i.e. the 2nd stage scenario), and then to the outer
one (i.e. the 1st stage scenario). Let $L_s: \mathbb{R}^k \to \mathbb{R}$ be defined by

$$L_s(w) = \min_{j\in s} \int_{\Omega^k} \ell(\theta_0-\theta_j) \prod_{i=1}^{k} h_{\theta_i}(w_i)\, d\tau(\theta), \quad w \in \mathbb{R}^k. \tag{23}$$


Lemma 3. For every $w \in \mathbb{R}^k$, $s \subseteq \{1,\dots,k\}$ with $|s| = t$, $i \in s$, $j \in \{1,\dots,k\} \setminus s$
and $\tilde s = (s \setminus \{i\}) \cup \{j\}$, $w_i < w_j$ implies $L_{\tilde s}(w) \le L_s(w)$.

Proof: Let $r \in \{1,\dots,k\}$ and $w \in \mathbb{R}^k$ be fixed. Then, except for a
normalizing factor depending on $w$, $R_r = \int_{\Omega^k} \ell(\theta_0-\theta_r) \prod_{i=1}^{k} h_{\theta_i}(w_i)\, d\tau(\theta)$ can be
viewed as being the posterior risk ($w$ are the given "observations" and $\theta$ are
the "parameters") for decision $\{r\}$ in a fixed size 1 subset selection problem
of the type treated in Eaton (1967). The loss function hereby is
$L_{\{i\}}(\theta) = \ell(\theta_0-\theta_i)$, $\theta \in \Omega^k$, $i = 1,\dots,k$, which clearly satisfies the
monotonicity and invariance properties (3.4) and (3.5) of Eaton (1967).
Thus by his Lemma 4.1 we know that $R_1,\dots,R_k$ are ordered in the reverse
order to $w_1,\dots,w_k$. This completes the proof.

In view of Lemma 3 we know now that an optimal $s$ with $|s| = t$ minimizes

$$\int_{\mathbb{R}^k} L_s(w) \prod_{i=1}^{k} G(u_i,\tilde T(u_i,w_i))\, \frac{\partial \tilde T(u_i,w_i)}{\partial w_i}\, dw, \tag{24}$$


which  can  be  viewed  to  be  (except  for  a  normalizing  factor  depending  on  u) 
the  posterior  risk  (u  are  the  "observations"  and  w  the  "parameters")  for 
decision  s  in  a  fixed  size  t  subset  selection  problem  of  the  type  treated  in 
Eaton (1967). The loss function $L_s(w)$ hereby satisfies (by Lemma 3) the
monotonicity  property  (3.4)  and,  obviously,  also  the  invariance  property 
(3.5)  of  Eaton  (1967).  Thus  by  his  Lemma  4.1  we  see  that  the  following 
Assumption  (A)  is  sufficient  for  the  existence  of  a  monotone  optimal  s  with 
|s|  =  t. 

Assumption (A). The distributions are as stated in Section 1,
and the function $G(u,\tilde T(u,w))\, \frac{\partial \tilde T(\xi,\eta)}{\partial \eta}\big|_{(\xi,\eta)=(u,w)}$, $(u,w) \in \{(u,T(u,v)) \mid (u,v) \in \mathbb{R}^2\}$,
has MLR.

Theorem 4. Under Assumption (A), for every loss structure $L$ of type
(18) and every permutation invariant prior distribution $\tau$, there exists a
2-stage Bayes procedure $(S^B,d^B)$ which is monotone.

It  is  now  of  interest  to  find  simple  sufficient  conditions  for 
Assumption  (A)  to  hold  true.  For  exponential  families  we  get  the  following. 


Assumption (B).

The underlying distributions for all observations from $\pi_1,\dots,\pi_k$
belong to an exponential family with densities $a(\theta)b(x)\exp\{\theta x\}$, $x \in \mathbb{R}$,
$\theta \in \Omega$, w.r.t. the Lebesgue measure on $\mathbb{R}$, where the function $b(x)$, $x \in \mathbb{R}$,
is log-concave (i.e. the densities are strongly unimodal).


Theorem 5. Assumption (B) implies Assumption (A).

Proof: Let $U_i = \sum_{j=1}^{n} X_{ij}$, $V_i = \sum_{j=1}^{m} Y_{ij}$ and $W_i = U_i + V_i$, $i = 1,\dots,k$.
Thus we have $T(u,v) = u+v$ and $\tilde T(u,w) = w-u$, $u,v,w \in \mathbb{R}$. The density of
$U_i$ is $a(\theta_i)^n b^{*n}(u)\exp\{\theta_i u\}$, $u \in \mathbb{R}$, and the density of $V_i$ is $a(\theta_i)^m b^{*m}(v)
\exp\{\theta_i v\}$, $v \in \mathbb{R}$, $i = 1,\dots,k$, where $b^{*n}$ ($b^{*m}$) denotes the $n$-fold ($m$-fold)
convolution of $b$ with respect to the Lebesgue measure on $\mathbb{R}$. It follows
that

$$G(u,\tilde T(u,w))\, \frac{\partial \tilde T(\xi,\eta)}{\partial \eta}\Big|_{(\xi,\eta)=(u,w)} = b^{*n}(u)\, b^{*m}(w-u), \quad u,w \in \mathbb{R}. \tag{25}$$

Let the function $b(x)$, $x \in \mathbb{R}$, be log-concave. Then by Ibragimov
(1956), the function $b^{*m}(x)$, $x \in \mathbb{R}$, has the same property. But this is
equivalent for $b^{*m}(w-u)$ to have MLR in $w \in \mathbb{R}$ w.r.t. $u \in \mathbb{R}$ (cf. Lehmann
(1959), p. 330), and therefore it is also equivalent for Assumption (A) to
hold true.
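
The step from Ibragimov (1956) can be illustrated numerically. The sketch below is only an illustration under assumed choices that do not come from the text: the kernel $b(x) = \exp\{-|x|\}$, $m = 2$, and a grid on $[-20, 20]$. It forms the 2-fold convolution on the grid and checks that the second differences of its logarithm are non-positive, which is the discrete form of log-concavity.

```python
import numpy as np

# Assumed illustration: a log-concave kernel b(x) = exp(-|x|) on a symmetric grid.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
b = np.exp(-np.abs(x))

# Riemann-sum approximation of the 2-fold convolution b * b on the same grid.
b2 = np.convolve(b, b, mode="same") * dx


def max_log_second_difference(values, grid, radius):
    """Largest second difference of log(values) over the grid points with |x| <= radius."""
    inner = np.abs(grid) <= radius
    log_values = np.log(values[inner])
    return float(np.max(log_values[2:] - 2.0 * log_values[1:-1] + log_values[:-2]))


# A non-positive maximum means log(b * b) is concave on the inspected range.
curvature = max_log_second_difference(b2, x, radius=5.0)
print(curvature <= 0.0)
```

Log-concavity of $b^{*m}$ is what makes $b^{*m}(w-u)$ a density with MLR in $w$ with respect to $u$, so a check of this kind is a quick sanity test for a candidate kernel $b$, not a proof.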

Remark 3. It is not difficult to see that in the general case the
following conditions are sufficient for Assumption (A) to hold true:
$T(u,v) = e_1 u + e_2 v$, $u,v \in \mathbb{R}$, $e_1, e_2 > 0$, and $\{g_\theta\}_{\theta\in\Omega}$ is a family of log-
concave (i.e. strongly unimodal) densities. This follows directly from the
factorization identity $f_\theta(u)g_\theta(v) = h_\theta(T(u,v))G(u,v)$, $u,v \in \mathbb{R}$, $\theta \in \Omega$.

For the remainder of this section let us consider the normal case in
more detail. Here, Assumption (B) is satisfied with $b(x) = \exp\{-x^2/2\sigma^2\}$,
$x \in \mathbb{R}$, and thus Theorem 4 is valid in this case. Let us assume that a priori
$\theta_1,\dots,\theta_k$ are independent identically distributed normal random variables with mean
$\theta_0$ and variance $r > 0$. Then at $u \in \mathbb{R}^k$ with $u_1 \le \dots \le u_k$ the optimal procedure
decides at Stage 1 in favor of the subset with the smallest value in the following scheme.


$s$ : Posterior risk at $u \in \mathbb{R}^k$ with $u_1 \le \dots \le u_k$

$\emptyset$ : $0$

$\{k\}$ : $E^0[\ell(\alpha_2(\theta_0-u_k) + \alpha_1 Q_0)]$

$\{k-t+1,\dots,k\}$ : $tc + E[\min_{j \ge k-t+1} E^0(\ell(\alpha_2(\theta_0-u_j) + \alpha_3 Q_j + \alpha_4 Q_0))]$, $t \ge 2$,

where $Q_0,Q_1,\dots,Q_k$ are auxiliary random variables which are independent
standard normals, $E^0$ denotes expectation w.r.t. $Q_0$, and

$\alpha_1 = (rp(p+r)^{-1})^{1/2}$, $\alpha_2 = r(p+r)^{-1}$, $\alpha_3 = pr[(p+r)(pq+pr+qr)]^{-1/2}$,
$\alpha_4 = [rpq/(pq+pr+qr)]^{1/2}$.
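
The four constants are simple functions of $p$, $q$ and $r$. The sketch below computes them; the function name `posterior_constants` is illustrative, and reading $p$ and $q$ as the sampling variances of the first and second stage statistics is an assumption, since their definition lies outside this section. Direct algebra on the formulas gives $\alpha_1^2 = \alpha_3^2 + \alpha_4^2$, which serves as a quick consistency check.

```python
import math


def posterior_constants(p: float, q: float, r: float):
    """Return (alpha1, alpha2, alpha3, alpha4) from the formulas in the text.

    p, q: assumed to be the variances of the stage 1 and stage 2 statistics.
    r: prior variance of each parameter.
    """
    d = p * q + p * r + q * r
    alpha1 = math.sqrt(r * p / (p + r))
    alpha2 = r / (p + r)
    alpha3 = p * r / math.sqrt((p + r) * d)
    alpha4 = math.sqrt(r * p * q / d)
    return alpha1, alpha2, alpha3, alpha4


a1, a2, a3, a4 = posterior_constants(p=1.0, q=0.5, r=2.0)
# The two sides agree up to rounding error.
print(a1 ** 2, a3 ** 2 + a4 ** 2)
```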

Similar to what was done by Goel and Rubin (1977), let us show next
that the Bayes solution at $u \in \mathbb{R}^k$ with $u_1 \le \dots \le u_k$ can be determined in the
following short way. Let $r_t$ denote the posterior risk for decision
$s = \{k-t+1,\dots,k\}$, $t = 0,1,\dots,k$. At first one compares $r_0$, $r_1$, $r_2$. If
$r_0$ ($r_1$) is the minimum then the final decision is $s = \emptyset$ ($\{k\}$). If $r_2 < r_0, r_1$
then $r_3, r_4, \dots$ are successively computed until $r_{t_0} \le r_{t_0+1}$ occurs for the
first time; then $s = \{k-t_0+1,\dots,k\}$ is the final decision. This method is
justified by the following result.
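
The search just described can be sketched as follows. The function name `stage1_subset_size`, the callable `risk` (returning $r_t$ for a given $t$), and the handling of ties are assumptions made for illustration; the text does not specify how ties are broken.

```python
from typing import Callable


def stage1_subset_size(risk: Callable[[int], float], k: int) -> int:
    """Return t such that s = {k-t+1, ..., k} is chosen; t = 0 means the empty set.

    risk(t) is the posterior risk r_t. Ties are resolved in favor of the
    smaller subset, which is an assumption of this sketch.
    """
    r0, r1 = risk(0), risk(1)
    if k == 1:
        return 0 if r0 <= r1 else 1
    r2 = risk(2)
    if r2 >= min(r0, r1):
        return 0 if r0 <= r1 else 1
    # r_2 is the smallest so far: compute r_3, r_4, ... until the risk stops decreasing.
    t, current = 2, r2
    while t < k:
        following = risk(t + 1)
        if current <= following:
            break
        t, current = t + 1, following
    return t


# Hypothetical risks r_0, ..., r_5 with non-increasing differences from t = 2 on.
risks = [0.0, -0.5, -0.9, -1.1, -1.0, -0.6]
print(stage1_subset_size(lambda t: risks[t], k=5))
```

The point of the early stop is that no risk beyond $r_{t_0+1}$ has to be computed, which matters because each $r_t$ with $t \ge 2$ involves the expectation of a minimum.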

Lemma 4. Let $u \in \mathbb{R}^k$ with $u_1 \le \dots \le u_k$ be fixed and let $r_0, r_1,\dots,r_k$ be
defined as stated above. Then $r_2 - r_3 \ge r_3 - r_4 \ge \dots \ge r_{k-1} - r_k$.


Proof: Let $Z_1,\dots,Z_k$ be random variables defined by
$Z_j = -E^0(\ell(\alpha_2(\theta_0-u_j) + \alpha_3 Q_j + \alpha_4 Q_0))$, $j = 1,\dots,k$. Then by $u_1 \le \dots \le u_k$
and the fact that $\ell$ is non-increasing we have $Z_1 \le_{st} Z_2 \le_{st} \dots \le_{st} Z_k$ (where
"$\le_{st}$" denotes stochastic ordering). Moreover, $r_t = tc - E(\max_{j \ge k-t+1} Z_j)$,
$t = 2,\dots,k$. Thus, for $t \ge 2$, by Chernoff and Yahav (1977) we get

$$r_t - r_{t+1} = E(\max_{j \ge k-t} Z_j) - E(\max_{j \ge k-t+1} Z_j) - c = \int_{\mathbb{R}} \prod_{j \ge k-t+1} P\{Z_j \le x\}\, P\{Z_{k-t} > x\}\, dx - c,$$

which clearly is non-increasing in $t$, $t = 2,\dots,k-1$.
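
The integral representation in the last display can be checked directly, assuming that $Z_1,\dots,Z_k$ are independent (each is built from its own $Q_j$) and integrable. Writing $X = Z_{k-t}$ and $M = \max_{j \ge k-t+1} Z_j$, a short derivation is:

```latex
\begin{align*}
E\max(X, M) - E M
  &= E\,(X - M)^{+}
   = E \int_{\mathbb{R}} 1\{M \le x < X\}\, dx \\
  &= \int_{\mathbb{R}} P\{M \le x\}\, P\{X > x\}\, dx
   = \int_{\mathbb{R}} \prod_{j \ge k-t+1} P\{Z_j \le x\}\; P\{Z_{k-t} > x\}\, dx .
\end{align*}
```

As $t$ grows, the product gains a further factor not exceeding one and, by the stochastic ordering, $P\{Z_{k-t} > x\}$ does not increase, so the integrand is non-increasing in $t$ for every $x$.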

Let us finally take a brief look at the special case of a linear loss
function $\ell(\xi) = a\xi$, $\xi \in \mathbb{R}$, $a > 0$, where we can choose $a = 1$ (since other
values of $a$ can be compensated in the cost $c$). Then the decision at
Stage 1 is based on the following scheme.

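$s$ : Posterior risk at $u \in \mathbb{R}^k$ with $u_1 \le \dots \le u_k$

$\emptyset$ : $0$

$\{k\}$ : $\alpha_2(\theta_0-u_k)$

$\{k-t+1,\dots,k\}$ : $\alpha_2(\theta_0-u_k) + tc - \alpha_2 E(\max_{j \ge k-t+1} (u_j - u_k + \alpha_3\alpha_2^{-1} Q_j))$, $t \ge 2$.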
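
For the linear loss the scheme can be evaluated numerically. The sketch below is one possible way to do so, not the method of the paper: it reads the risk for $\{k-t+1,\dots,k\}$ as $\alpha_2(\theta_0-u_k) + tc - \alpha_2 E(\max_{j \ge k-t+1}(u_j - u_k + \alpha_3\alpha_2^{-1}Q_j))$ and computes the expectation of the maximum by integrating the product of normal distribution functions on a grid. The function names, the grid size and the truncation width are illustrative assumptions.

```python
import math
import numpy as np


def expected_max_normal(means, scale, grid_size=20001, width=10.0):
    """E max_j (means[j] + scale * Q_j) for independent standard normal Q_j."""
    means = np.asarray(means, dtype=float)
    if scale == 0.0:
        return float(means.max())
    x = np.linspace(means.min() - width * scale, means.max() + width * scale, grid_size)
    erf = np.vectorize(math.erf)
    z = (x[:, None] - means[None, :]) / (scale * math.sqrt(2.0))
    # Distribution function of the maximum: product of the individual normal CDFs.
    cdf = np.prod(0.5 * (1.0 + erf(z)), axis=1)
    # With CDF ~ 0 at the lower and ~ 1 at the upper grid end: E max = upper - integral of CDF.
    dx = x[1] - x[0]
    integral = dx * (cdf.sum() - 0.5 * (cdf[0] + cdf[-1]))  # trapezoidal rule
    return float(x[-1] - integral)


def linear_loss_risks(u, theta0, c, alpha2, alpha3):
    """Posterior risks r_0, ..., r_k for u sorted in non-decreasing order and loss l(xi) = xi."""
    u = np.asarray(u, dtype=float)
    k = len(u)
    base = alpha2 * (theta0 - u[-1])
    risks = [0.0, base]
    for t in range(2, k + 1):
        shifted = u[k - t:] - u[-1]  # u_j - u_k for the t largest observations
        risks.append(base + t * c - alpha2 * expected_max_normal(shifted, alpha3 / alpha2))
    return risks


# Hypothetical inputs, chosen only to exercise the functions.
print(linear_loss_risks([-1.0, 0.0, 1.0], theta0=0.0, c=0.1, alpha2=0.5, alpha3=0.3))
```

The grid integration is adequate for small $k$; for larger $k$ the bounds cited in the text may be preferable to repeated numerical integration.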

Lower and upper bounds for the expectations in this scheme can be
found in Miescke (1979b) to approximate the Bayes procedure. Note that if
for a $t \in \{2,\dots,k\}$ we have $tc \ge \alpha_3 E(\max_{j=1,\dots,t} Q_j)$, then at most $t-1$ populations
are selected at the first stage. Thus in the case of $2c \ge \alpha_3 \pi^{-1/2}$ the
Bayes procedure is of the type Pg (cf. (16)). And for the case of $k = 2$
populations the Bayes procedure is of the type PA (cf. (17)), except for
an area in the neighborhood of $(\theta_0,\theta_0)$ where the Bayes procedure selects both
populations.